Instructions

For each question below, show code. Once you’ve completed things, don’t forget to to upload this document (knitted version please!) to Blackboard!

A few tips:

25 points total + 2 points extra-credit


1. Use tidycensus to download 1. race/ethnicity (B03002) and 2. median household income for Baltimore City. Store this data in a new object. Choose which race/ethnicity you’d like to relate to income (Non-Hispanic Black and Non-Hispanic White work best). Which census tract has the highest percentage of your target race/ethnicity (and what is the percent) and which has the highest median household income (and how much is it?)? (5 points) Reminder: Since we will be mapping our data, make sure you include use geometry = TRUE in get_acs()

  1. I used the ACS 2019 5-year estimate survey to download the median household income, total population, and non-hispanic white population for Baltimore City. The census tract with the highest percentage of non-hispanic white population is Census Tract 2404, with the percentage of white population comprising 91.8% of the population in this tract. The census tract with the highest median household income is Census Tract 2711.02, with a household median income of $195,156.
# use tidycensus to obtain tract data for total pop, non-hispanic white pop, and household median income
md_race_income_2019 <- get_acs(geography = "tract",
                            variables = c("non_his_white" = "B03002_003", # non-hispanic white population
                                          "total_pop" = "B03002_001", # total population
                                          "med_hh_inc" = "B19013_001" # Median household income
                                          ),
                            year = 2019,
                            state = c(24), # Maryland
                            county = c(510), # Baltimore City
                            survey = "acs5", # 5 year survey
                            geometry = TRUE, #to download for mapping
                            output = "wide")
## Getting data from the 2015-2019 5-year ACS
# create a new variable to append to our data representing percent white population per tract
md_race_income_2019$whitepct <- (md_race_income_2019$non_his_whiteE / md_race_income_2019$total_popE) * 100

# Get a general idea of values for each column to narrow down filtering for max function
head(md_race_income_2019)
## Simple feature collection with 6 features and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -76.655 ymin: 39.22373 xmax: -76.54867 ymax: 39.3311
## Geodetic CRS:  NAD83
##         GEOID                                           NAME non_his_whiteE
## 1 24510260202 Census Tract 2602.02, Baltimore city, Maryland            254
## 2 24510150100    Census Tract 1501, Baltimore city, Maryland             86
## 3 24510250303 Census Tract 2503.03, Baltimore city, Maryland           1117
## 4 24510180100    Census Tract 1801, Baltimore city, Maryland              5
## 5 24510020300     Census Tract 203, Baltimore city, Maryland           2855
## 6 24510250402 Census Tract 2504.02, Baltimore city, Maryland           1314
##   non_his_whiteM total_popE total_popM med_hh_incE med_hh_incM
## 1            110       5681        637       41538        7943
## 2             80       2172        383       17371       10440
## 3            227       2229        274       38519       11718
## 4              8       2185        267       15195        5979
## 5            374       3761        416      108026       22478
## 6            283       4614        466       37451       12822
##                         geometry  whitepct
## 1 MULTIPOLYGON (((-76.56596 3...  4.471044
## 2 MULTIPOLYGON (((-76.64501 3...  3.959484
## 3 MULTIPOLYGON (((-76.655 39.... 50.112158
## 4 MULTIPOLYGON (((-76.63413 3...  0.228833
## 5 MULTIPOLYGON (((-76.60051 3... 75.910662
## 6 MULTIPOLYGON (((-76.60817 3... 28.478544
md_race_income_2019 <- md_race_income_2019 %>%
  filter(total_popE > 0)

# filter percent white pop to determine the max value 
md_race_income_2019 %>%
  filter(whitepct > 80) %>% # use values above 80% to narrow down our search
  filter(whitepct== max(whitepct))
## Simple feature collection with 1 feature and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -76.61192 ymin: 39.25471 xmax: -76.59241 ymax: 39.27298
## Geodetic CRS:  NAD83
##         GEOID                                        NAME non_his_whiteE
## 1 24510240400 Census Tract 2404, Baltimore city, Maryland           2498
##   non_his_whiteM total_popE total_popM med_hh_incE med_hh_incM
## 1            201       2721        190      110982       11576
##                         geometry whitepct
## 1 MULTIPOLYGON (((-76.61192 3... 91.80448
# filter medium household income to determine max value
md_race_income_2019 %>% 
  filter(med_hh_incE > 30000) %>% # use values above 30000 to narrow our search
  filter(med_hh_incE== max(med_hh_incE))
## Simple feature collection with 1 feature and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -76.62923 ymin: 39.33853 xmax: -76.60958 ymax: 39.35387
## Geodetic CRS:  NAD83
##         GEOID                                           NAME non_his_whiteE
## 1 24510271102 Census Tract 2711.02, Baltimore city, Maryland           2724
##   non_his_whiteM total_popE total_popM med_hh_incE med_hh_incM
## 1            440       4597        501      195156       48413
##                         geometry whitepct
## 1 MULTIPOLYGON (((-76.62923 3... 59.25604

2. Please reproject this data to Web Mercator. (1 points)

  1. Used st_crs on initial variable to check the original projection: NAD83. I then used st_transform and stored the newly projected data in a new variable. I then used st_crs a second time on the new variable to ensure the reprojection to Web Mercator took place effectively.
# check initial projection (should be NAD83)
st_crs(md_race_income_2019)
## Coordinate Reference System:
##   User input: NAD83 
##   wkt:
## GEOGCRS["NAD83",
##     DATUM["North American Datum 1983",
##         ELLIPSOID["GRS 1980",6378137,298.257222101,
##             LENGTHUNIT["metre",1]]],
##     PRIMEM["Greenwich",0,
##         ANGLEUNIT["degree",0.0174532925199433]],
##     CS[ellipsoidal,2],
##         AXIS["latitude",north,
##             ORDER[1],
##             ANGLEUNIT["degree",0.0174532925199433]],
##         AXIS["longitude",east,
##             ORDER[2],
##             ANGLEUNIT["degree",0.0174532925199433]],
##     ID["EPSG",4269]]
# use st_transform to reproject
md_race_income_2019_webmerc <- st_transform(md_race_income_2019, crs = 3857)

# check to make sure the transform worked
st_crs(md_race_income_2019_webmerc)
## Coordinate Reference System:
##   User input: EPSG:3857 
##   wkt:
## PROJCRS["WGS 84 / Pseudo-Mercator",
##     BASEGEOGCRS["WGS 84",
##         DATUM["World Geodetic System 1984",
##             ELLIPSOID["WGS 84",6378137,298.257223563,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4326]],
##     CONVERSION["Popular Visualisation Pseudo-Mercator",
##         METHOD["Popular Visualisation Pseudo Mercator",
##             ID["EPSG",1024]],
##         PARAMETER["Latitude of natural origin",0,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",0,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["False easting",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["easting (X)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["northing (Y)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Web mapping and visualisation."],
##         AREA["World between 85.06°S and 85.06°N."],
##         BBOX[-85.06,-180,85.06,180]],
##     ID["EPSG",3857]]

3. Create two plots. In the first plot highlight the tract with the highest concentration of your selected race/eth. In the second plot highlight the tract with the highest median household income? (5 points)

# plot tract with highest % white population over the full Balt City map
md_race_income_2019_webmerc %>% 
  filter(whitepct > 80) %>% 
  filter(whitepct== max(whitepct)) %>%
  ggplot() + 
  geom_sf(data = md_race_income_2019_webmerc) +
  geom_sf(aes(fill = whitepct), lwd = NA) + 
  theme_dark() + 
  ggtitle("Census Tract in Baltimore City with highest percentage of White Population") +
  theme(plot.title = element_text(size = 11.5)) +
  xlab("Longitude") +
  ylab("Latitude")

# plot tract with the highest median household income over the full Balt City map
md_race_income_2019_webmerc %>% 
  filter(med_hh_incE > 30000) %>% 
  filter(med_hh_incE== max(med_hh_incE)) %>%
  ggplot() + 
  geom_sf(data = md_race_income_2019_webmerc) +
  geom_sf(aes(fill = med_hh_incE), lwd = NA) + 
  theme_dark() + 
  ggtitle("Census Tract in Baltimore City with highest Median Household Income") +
  theme(plot.title = element_text(size = 11.5)) +
  xlab("Longitude") +
  ylab("Latitude")

4. Create a third column using the bi_class function from the tutorial. (2 points)

# create classes using call for bi_class function of dimension 3 with jenks break 
bi_md_race_inc <- bi_class(md_race_income_2019_webmerc, x = whitepct, y = med_hh_incE, style = "jenks", dim = 3)
## Warning in classInt::classIntervals(bins_y, n = dim, style = "jenks"): var has
## missing values, omitted in finding classes
# use table to ensure classes were properly allocated
table(bi_md_race_inc$bi_class)
## 
##  1-1  1-2 1-NA  2-1  2-2  3-2  3-3 
##   82   27    1   22   34   13   20

5. Create a bivariate map with your data. (3 points)

# store plot in a new variable `map` to call later when plotting with legend
map <- ggplot() +
  geom_sf(data = bi_md_race_inc, mapping = aes(fill = bi_class), color = "white", size = 0.1, show.legend = FALSE) + 
  bi_scale_fill(pal = "DkBlue", dim = 3) +
  labs(
    title = "Race and Income Baltimore City, MD",
    subtitle = "Dark Blue (DkBlue) Palette"
  ) +
  bi_theme()

# plot the new map
plot(map)

6. Use the cowplot package and ggdraw, like in the tutorial to add a legend (2 points).

# create bivaraite legend and store in new variable to plot with map 
legend <- bi_legend(pal = "DkBlue",
                    dim = 3,
                    xlab = "Higher % White ",
                    ylab = "Higher Income ",
                    size = 7)

# combine map with legend into new variable
finalPlot <- ggdraw() +
  draw_plot(map, 0, 0, 1, 1) +
  draw_plot(legend, 0.22, .05, 0.25, 0.25)

# plot variable containing map and legend
plot(finalPlot)

7. Rinse and repeat for another county of your choosing, using a different color scheme. Be sure to use Psuedo-Mercator (3857). (5 points)

# use tidycensus to obtain tract data for total pop, non-hispanic white pop, and household median income for Calvert County 
cc_race_income_2019 <- get_acs(geography = "tract",
                            variables = c("non_his_white" = "B03002_003", # non-hispanic white population
                                          "total_pop" = "B03002_001", # total population
                                          "med_hh_inc" = "B19013_001" # Median household income
                                          ),
                            year = 2019,
                            state = c(24), # Maryland
                            county = c(9), # Calvert County
                            survey = "acs5", # 5 year survey
                            geometry = TRUE, #to download for mapping
                            output = "wide")
## Getting data from the 2015-2019 5-year ACS
# create a new variable to append to our data representing percent white population per tract
cc_race_income_2019$whitepct <- (cc_race_income_2019$non_his_whiteE / cc_race_income_2019$total_popE) * 100

# Get a general idea of values for each column to narrow down filtering for max function
head(cc_race_income_2019)
## Simple feature collection with 6 features and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -76.70196 ymin: 38.3751 xmax: -76.4811 ymax: 38.76928
## Geodetic CRS:  NAD83
##         GEOID                                           NAME non_his_whiteE
## 1 24009860702 Census Tract 8607.02, Calvert County, Maryland           1902
## 2 24009860102 Census Tract 8601.02, Calvert County, Maryland           2672
## 3 24009860101 Census Tract 8601.01, Calvert County, Maryland           2445
## 4 24009860501 Census Tract 8605.01, Calvert County, Maryland           5308
## 5 24009860801 Census Tract 8608.01, Calvert County, Maryland           5280
## 6 24009860402 Census Tract 8604.02, Calvert County, Maryland           2700
##   non_his_whiteM total_popE total_popM med_hh_incE med_hh_incM
## 1            275       3256        345       82006       15372
## 2            221       2990        238      113188       17489
## 3            263       3159        210      135375       14592
## 4            426       7020        412      149563       14119
## 5            547       6900        624      126081       17096
## 6            368       3599        428       99779       15300
##                         geometry whitepct
## 1 MULTIPOLYGON (((-76.60506 3... 58.41523
## 2 MULTIPOLYGON (((-76.66229 3... 89.36455
## 3 MULTIPOLYGON (((-76.70121 3... 77.39791
## 4 MULTIPOLYGON (((-76.61466 3... 75.61254
## 5 MULTIPOLYGON (((-76.60889 3... 76.52174
## 6 MULTIPOLYGON (((-76.56918 3... 75.02084
cc_race_income_2019 <- cc_race_income_2019 %>%
  filter(total_popE > 0)

# filter percent white pop to determine the max value 
cc_race_income_2019 %>%
  filter(whitepct > 80) %>% # use values above 80% to narrow down our search
  filter(whitepct== max(whitepct))
## Simple feature collection with 1 feature and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -76.471 ymin: 38.36306 xmax: -76.38148 ymax: 38.45049
## Geodetic CRS:  NAD83
##         GEOID                                           NAME non_his_whiteE
## 1 24009861001 Census Tract 8610.01, Calvert County, Maryland           1056
##   non_his_whiteM total_popE total_popM med_hh_incE med_hh_incM
## 1            122       1148        112      144219       22239
##                         geometry whitepct
## 1 MULTIPOLYGON (((-76.47093 3... 91.98606
# filter medium household income to determine max value
cc_race_income_2019 %>% 
  filter(med_hh_incE > 30000) %>% # use values above 30000 to narrow our search
  filter(med_hh_incE== max(med_hh_incE))
## Simple feature collection with 1 feature and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -76.701 ymin: 38.63048 xmax: -76.59602 ymax: 38.72666
## Geodetic CRS:  NAD83
##         GEOID                                        NAME non_his_whiteE
## 1 24009860200 Census Tract 8602, Calvert County, Maryland           4890
##   non_his_whiteM total_popE total_popM med_hh_incE med_hh_incM
## 1            361       6138        361      155548       23827
##                         geometry whitepct
## 1 MULTIPOLYGON (((-76.70067 3... 79.66764
# check initial projection (should be NAD83)
st_crs(cc_race_income_2019)
## Coordinate Reference System:
##   User input: NAD83 
##   wkt:
## GEOGCRS["NAD83",
##     DATUM["North American Datum 1983",
##         ELLIPSOID["GRS 1980",6378137,298.257222101,
##             LENGTHUNIT["metre",1]]],
##     PRIMEM["Greenwich",0,
##         ANGLEUNIT["degree",0.0174532925199433]],
##     CS[ellipsoidal,2],
##         AXIS["latitude",north,
##             ORDER[1],
##             ANGLEUNIT["degree",0.0174532925199433]],
##         AXIS["longitude",east,
##             ORDER[2],
##             ANGLEUNIT["degree",0.0174532925199433]],
##     ID["EPSG",4269]]
# use st_transform to reproject
cc_race_income_2019_webmerc <- st_transform(cc_race_income_2019, crs = 3857)

# check to make sure the transform worked
st_crs(cc_race_income_2019_webmerc)
## Coordinate Reference System:
##   User input: EPSG:3857 
##   wkt:
## PROJCRS["WGS 84 / Pseudo-Mercator",
##     BASEGEOGCRS["WGS 84",
##         DATUM["World Geodetic System 1984",
##             ELLIPSOID["WGS 84",6378137,298.257223563,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4326]],
##     CONVERSION["Popular Visualisation Pseudo-Mercator",
##         METHOD["Popular Visualisation Pseudo Mercator",
##             ID["EPSG",1024]],
##         PARAMETER["Latitude of natural origin",0,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",0,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["False easting",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["easting (X)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["northing (Y)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Web mapping and visualisation."],
##         AREA["World between 85.06°S and 85.06°N."],
##         BBOX[-85.06,-180,85.06,180]],
##     ID["EPSG",3857]]
# create classes using call for bi_class function of dimension 3 with quantile break 
bi_cc_race_inc <- bi_class(cc_race_income_2019_webmerc, x = whitepct, y = med_hh_incE, style = "quantile", dim = 3)

# use table to ensure classes were properly allocated
table(bi_cc_race_inc$bi_class)
## 
## 1-1 1-2 1-3 2-1 2-2 2-3 3-1 3-2 3-3 
##   3   2   1   1   1   4   2   3   1
# store plot in a new variable `cc_map` to call later when plotting with legend
cc_map <- ggplot() +
  geom_sf(data = bi_cc_race_inc, mapping = aes(fill = bi_class), color = "white", size = 0.1, show.legend = FALSE) +
  bi_scale_fill(pal = "DkViolet", dim = 3) +
  labs(
    title = "Race and Income Calvert County, MD",
    subtitle = "Dark Blue (DkViolet) Palette"
  ) +
  bi_theme()

# create bivaraite legend and store in new variable to plot with map 
cc_legend <- bi_legend(pal = "DkViolet",
                    dim = 3,
                    xlab = "Higher % White ",
                    ylab = "Higher Income ",
                    size = 7)

# combine map with legend
cc_finalPlot <- ggdraw() +
  draw_plot(cc_map, 0, 0, 1, 1) +
  draw_plot(cc_legend, 0.22, .05, 0.25, 0.25)

# plot final map
plot(cc_finalPlot)

8. Write the bi_class output to a geojson file. (1 points)

# Write original object out as geojson file to open in qgis
st_write(bi_cc_race_inc, "bi_cc_race_inc.geojson", append = TRUE)
## Updating layer `bi_cc_race_inc' to data source `bi_cc_race_inc.geojson' using driver `GeoJSON'
## Updating existing layer bi_cc_race_inc
## Writing 18 features with 10 fields and geometry type Multi Polygon.

9. Now open your geojson output and create a QGIS map of your bivariate map. Put an image of that map here. (2 points)

Bivariate Map

Bivariate Map

10. Use qgis2web and put a link here to your github site with the webmap of your bivariate map. (3 points)

Link to Github Page Below:

https://mfick1.github.io/

3. Reflection (3 points)

This reflection is free-form. Use this as a space to note:

This lab greatly enhanced my knowledge of how to define and call multiple map features into the same plot through R. In previous labs, the legend was always a function of the initial map that was defined as a quality of the data during plotting, but in this lab, we created the bivariate legend separately from the map itself and called them together to display in the same image. This allows for much greater flexibility and customization of other features of the map that aren’t simply just displaying the data. This will be useful knowledge as I begin to create more and more complex maps over time using R. Prior to this lab I also only new how to generate a bivaraite map through multiply of two datasets in QGIS which can be a bit more finicky and harder to define classes. In R this process seems to allow for more ease in adjusting values as desired for an overall more effective map.

4. Extra Credit (2 points)

Put an image as a legend in your web map.

(See webmap)

Knit your document to a .html file. Submit this knitted document.